Path workers, allow node operators to configure threads used #6667
shortthefomo wants to merge 13 commits into XRPLF:develop
Pull request overview
Adds configurability for pathfinding-related concurrency by introducing a new [path_workers] config and wiring it through the JobQueue’s jtUPDATE_PF limit, so operators can scale path update / order book update throughput with available worker threads.
Changes:
- Add [path_workers] config (default 2) with validation capped to max(2, floor(3/4 * effective workers)).
- Pass configured path worker limit into JobQueue and enforce it via a per-job-type limit for jtUPDATE_PF.
- Use the configured limit to bound LedgerMaster’s concurrent pathfinding work dispatch.
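The cap described in the first bullet can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper names pathWorkerCap and validatePathWorkers are hypothetical, and the lower bound of 1 for the requested value is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: upper bound for [path_workers] given the effective
// job-queue worker count, i.e. max(2, floor(3/4 * effectiveWorkers)).
inline int
pathWorkerCap(int effectiveWorkers)
{
    assert(effectiveWorkers >= 1);
    // Integer division rounds down for non-negative operands.
    return std::max(2, (effectiveWorkers * 3) / 4);
}

// Hypothetical validation: reject a requested value outside [1, cap].
// The lower bound of 1 is an assumption made for this sketch.
inline int
validatePathWorkers(int requested, int effectiveWorkers)
{
    int const cap = pathWorkerCap(effectiveWorkers);
    if (requested < 1 || requested > cap)
        throw std::invalid_argument(
            "[path_workers] must be between 1 and " + std::to_string(cap));
    return requested;
}
```

With 14 effective workers the cap works out to 10, so a requested value of 11 would be rejected, while a node with only 2 workers still gets a cap of 2.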
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/xrpld/core/detail/Config.cpp | Parse/validate [path_workers] against effective worker count |
| src/xrpld/core/ConfigSections.h | Add SECTION_PATH_WORKERS constant |
| src/xrpld/core/Config.h | Add PATH_WORKERS member defaulting to 2 |
| src/xrpld/app/main/Application.cpp | Compute effective worker threads and pass PATH_WORKERS into JobQueue ctor |
| src/xrpld/app/ledger/detail/LedgerMaster.cpp | Use PATH_WORKERS/JobQueue limit to cap concurrent jtUPDATE_PF dispatch |
| src/libxrpl/core/detail/JobQueue.cpp | Add ctor arg + enforce jtUPDATE_PF limit in getJobLimit() |
| include/xrpl/core/JobQueue.h | Expose new ctor parameter and getUpdatePathsJobLimit() accessor |
| cfg/xrpld-example.cfg | Document new [path_workers] section |
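For readers unfamiliar with per-job-type limits, the idea behind the JobQueue.cpp and LedgerMaster.cpp rows can be sketched as a bounded slot counter. This is an illustrative stand-in only: JobTypeLimiter and its methods are hypothetical names, and the real JobQueue tracks limits through its own job-type bookkeeping rather than a class like this.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a per-job-type concurrency limit such as the
// one applied to jtUPDATE_PF. Not the actual JobQueue implementation.
class JobTypeLimiter
{
public:
    explicit JobTypeLimiter(int limit) : limit_(limit)
    {
        assert(limit >= 1);
    }

    // Try to reserve a slot; returns false once the limit is reached.
    bool
    tryAcquire()
    {
        int current = running_.load();
        while (current < limit_)
        {
            // On failure, current is refreshed and the bound is rechecked.
            if (running_.compare_exchange_weak(current, current + 1))
                return true;
        }
        return false;
    }

    // Release a slot after a job of this type finishes.
    void
    release()
    {
        int const previous = running_.fetch_sub(1);
        assert(previous > 0);
        (void)previous;
    }

    int
    running() const
    {
        return running_.load();
    }

private:
    int const limit_;
    std::atomic<int> running_{0};
};
```

A dispatcher would call tryAcquire() before queuing another path update and release() when the job completes, so at most limit jobs of that type run at once.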
Comments suppressed due to low confidence (1)
include/xrpl/core/JobQueue.h:131
JobQueue is defined in a public header (include/xrpl/core/JobQueue.h) and this PR changes its constructor signature by adding updatePathsJobLimit. That is an externally visible API/ABI change (and a libxrpl change), so the PR description/checklist claiming “No API impact” looks inaccurate. Consider updating the PR metadata and/or preserving backward compatibility (e.g., overload or default parameter) if downstream code may construct JobQueue.
JobQueue(
int threadCount,
int updatePathsJobLimit,
beast::insight::Collector::ptr const& collector,
beast::Journal journal,
Logs& logs,
perf::PerfLog& perfLog);
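One way the overload suggestion could look is sketched below. JobQueueSketch is a simplified, hypothetical stand-in: the real constructor also takes the collector, journal, logs, and perf log, which are omitted here. Because updatePathsJobLimit sits in the middle of the parameter list, a trailing default argument would not work without reordering, so the sketch uses a delegating constructor instead.

```cpp
#include <cassert>

// Simplified, hypothetical stand-in for JobQueue. Only the two integer
// parameters are modelled; the remaining constructor arguments are omitted.
class JobQueueSketch
{
public:
    // Matches the documented default for [path_workers].
    static constexpr int defaultUpdatePathsJobLimit = 2;

    // New signature: the caller supplies the path update job limit.
    JobQueueSketch(int threadCount, int updatePathsJobLimit)
        : threadCount_(threadCount), updatePathsJobLimit_(updatePathsJobLimit)
    {
        assert(threadCount >= 1 && updatePathsJobLimit >= 1);
    }

    // Old signature kept for existing callers; delegates with the default.
    explicit JobQueueSketch(int threadCount)
        : JobQueueSketch(threadCount, defaultUpdatePathsJobLimit)
    {
    }

    int
    getUpdatePathsJobLimit() const
    {
        return updatePathsJobLimit_;
    }

    int
    threadCount() const
    {
        return threadCount_;
    }

private:
    int threadCount_;
    int updatePathsJobLimit_;
};
```

Existing call sites that pass only a thread count would keep compiling and get a limit of 2, while Application.cpp could pass the configured value explicitly.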
cfg/xrpld-example.cfg
# Maximum value is 3/4 of [workers] (with a minimum of 2).
#
#
The example config says the maximum is “3/4 of [workers]”, but the code enforces the limit against the effective job-queue worker count (which is auto-derived when [workers] isn’t explicitly set). To avoid misleading operators, consider documenting that the cap is based on the effective worker count (explicit [workers] or the auto-selected default).
Suggested change:
- # Maximum value is 3/4 of [workers] (with a minimum of 2).
- #
- #
+ # Maximum value is 3/4 of the effective JobQueue worker count (explicit
+ # [workers] or the auto-selected default), with a minimum of 2.
+ #
think this is more confusing.
High Level Overview of Change
Introduce a new configurable limit for pathfinding workers ([path_workers]), replacing the previous jt_update_pf_limit. This allows administrators to control the maximum number of concurrently running jtUPDATE_PF jobs in the JobQueue, which impacts path update and full order book update throughput. The limit is now capped at 3/4 of the configured [workers] (rounded down), with a minimum cap of 2, to prevent system overload while allowing better utilization of available threads.

Additionally, tie the number of pathfinding threads (mPathFindThread) dynamically to the configured workers, ensuring scalability.

This change heavily impacts a node's CPU use when configured away from the default.
Context of Change
The pathfinding thread count is now tied to workers to avoid fixed low limits that don't scale. This improves performance on larger deployments while preventing resource exhaustion.
The core finding is that, as xrpld stands today, pathfinding and order book updates are blocked behind a single thread, so all other pending requests get stuck behind the slowest request.
It is recommended that pathfinding nodes run in memory mode along with this patch #6549.
Every effort has been made to make sure the node is not starved when servicing a large number of requests. Even so, a validator should not configure its node to discover paths. This config is meant only for nodes set up to discover paths.
API Impact
libxrpl change (any change that may affect libxrpl or dependents of libxrpl)

No API impact - this is purely a configuration change.
Before / After
Before:
After:
[path_workers] (default 2, max 3/4 of workers rounded down, min cap 2)

Example: With [workers] = 14, max [path_workers] is now 10 (vs. previous 1).

Test Plan
To test: Set [path_workers] to a value > 3/4 of [workers] and verify startup failure with appropriate error.

Future Tasks
None - this completes the pathfinding worker configurability improvements.
*Mixed up the branches I was working on and totally messed up #6604, so this PR fixes that.